Excitation Codebook Design for Coding of the Singing Voice

Author

  • Youngmoo E. Kim
Abstract

The technique of Code Excited Linear Prediction (CELP) has led to the development of voice coding systems that provide toll-quality speech at very low bitrates. While speech and singing share many similarities in terms of production, standard speech coding implementations fall far short when transmitting the singing voice. This paper explores the reasons for this discrepancy and suggests new variations on CELP speech coders that specifically enhance the quality of encoded singing for individual singers. These modifications could be used in a low-bitrate singing voice codec which, in conjunction with multi-track structured coding schemes such as MPEG-4 Structured Audio, could provide a highly compressed yet high-quality representation of a complex audio scene.
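
For readers unfamiliar with CELP, the following is a minimal sketch of the generic excitation codebook search that such coders build on. It is not the method proposed in this paper: the function names, the random Gaussian codebook, the one-pole filter, and the frame length of 40 samples are all illustrative assumptions, and the perceptual weighting filter and adaptive (pitch) codebook used in real CELP coders are omitted.

```python
import random

def synthesize(excitation, lpc):
    """All-pole synthesis filter 1/A(z): s[n] = e[n] - sum_k a[k] * s[n-k]."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out

def search_codebook(target, codebook, lpc):
    """Analysis-by-synthesis search: return (index, gain, squared error)."""
    best = None
    for idx, code in enumerate(codebook):
        y = synthesize(code, lpc)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue  # an all-zero codevector cannot match anything
        # Optimal gain for this codevector in the least-squares sense.
        gain = sum(t * v for t, v in zip(target, y)) / energy
        err = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if best is None or err < best[2]:
            best = (idx, gain, err)
    return best

# Hypothetical setup: 64 random codevectors of 40 samples each.
rng = random.Random(0)
lpc = [-0.9]  # assumed one-pole filter: s[n] = e[n] + 0.9 * s[n-1]
codebook = [[rng.gauss(0, 1) for _ in range(40)] for _ in range(64)]

# Build a target frame from a known codevector so the search is checkable.
target = [2.0 * v for v in synthesize(codebook[17], lpc)]
index, gain, error = search_codebook(target, codebook, lpc)
```

In this sketch the encoder would transmit only `index` and a quantized `gain` per frame, which is why the contents of the codebook matter so much for a given class of signal such as singing.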

Similar resources

Efficient implementation of ITU-T G.723.1 speech coder for multichannel voice transmission and storage

The dual-rate G.723.1 speech coder has been widely applied to real-time video and teleconferencing applications where reduced bandwidth and good voice quality are required. This paper presents an efficient implementation of the G.723.1 speech coder. To simplify the excitation quantization procedure, which is the most computationally demanding, we propose fast algorithms for adaptive codebook and fixed co...


An Impulse Sequence Representation of the Excitation Source Characteristics of Nonverbal Speech Sounds

Impulse-sequence representation of the excitation source component of the normal speech signal has been of considerable interest in speech coding research. If a similar representation can be made for nonverbal (i.e., nonnormal or nonneutral) speech sounds, that would immensely help in their acoustic analyses and diverse applications. This paper proposes a representation of the excitation source cha...


A DYNAMIC PROGRAMMING APPROACH TO CONTEXT-FREE VOICE TRANSFORMATION (MonAmOR3)

In this paper, we present a dynamic programming approach to voice transformation (VT). The goal of VT is to modify the speech of a source speaker such that it is perceived as if spoken by a target speaker. The speech model used in this work is based on the MELP (Mixed Excitation Linear Prediction) speech coding algorithm. The designed system obtains speaker-specific codebooks of line spectral frequ...


Singing Voice Synthesis Based on Deep Neural Networks

Singing voice synthesis techniques have been proposed based on a hidden Markov model (HMM). In these approaches, the spectrum, excitation, and duration of singing voices are simultaneously modeled with context-dependent HMMs and waveforms are generated from the HMMs themselves. However, the quality of the synthesized singing voices still has not reached that of natural singing voices. Deep neur...


Structured Encoding of the Singing Voice Using Prior Knowledge of the Musical Score

The human voice is the most difficult musical instrument to simulate convincingly. Yet a great deal of progress has been made in voice coding, the parameterization and re-synthesis of a source signal according to an assumed voice model. Source-filter models of the human voice, particularly Linear Predictive Coding (LPC), are the basis of most low-bitrate (speech) coding techniques in use today...


Publication date: 2001